Add Op Diff #428

Open
wants to merge 3 commits into
base: master

Conversation

CelysPr
Contributor

@CelysPr CelysPr commented Jan 17, 2025

PR Category

Operator

Type of Change

Add new operator

Description

Implements diff(n) by recursively calling diff(n - 1).
Performance is currently suboptimal. An alternative implementation using convolution was even slower (speedup < 0.1) and showed significant accumulated error.
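
For reference, the two approaches described above can be sketched in plain Python. This is an illustration only, not the code in this PR: the function names are hypothetical, inputs are 1-D lists, and the "convolution" variant assumes the kernel is the signed binomial coefficients, which is one plausible reading of the description.

```python
from math import comb


def diff_recursive(xs, n=1):
    """n-th order forward difference, defined as diff(diff(xs, n - 1))."""
    if n < 0:
        raise ValueError("n must be non-negative")
    if n == 0:
        return list(xs)
    prev = diff_recursive(xs, n - 1)
    # First-order difference of the previous level: prev[i + 1] - prev[i].
    return [b - a for a, b in zip(prev, prev[1:])]


def diff_binomial(xs, n=1):
    """Single-pass variant: correlate xs with the kernel (-1)^(n-k) * C(n, k)."""
    kernel = [(-1) ** (n - k) * comb(n, k) for k in range(n + 1)]
    return [
        sum(c * xs[i + k] for k, c in enumerate(kernel))
        for i in range(len(xs) - n)
    ]
```

With exact integers both functions agree. In floating point, the binomial kernel grows quickly (for n = 50 the largest coefficient exceeds 10^14), so terms of opposite sign cancel heavily, which may be related to the accumulated error mentioned above.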

Issue

#393

Progress

  • Performance optimization

Performance

The operator remains accurate for large values of n (tested for n = 0 to 50).
image
Performance on A800 (cloud GPU):
[6 images]

@StrongSpoon
Collaborator

I'm not sure torch.diff belongs with the reduction operators. Is it possible to implement it as a pointwise one?
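
To make the pointwise idea concrete: each output element of a first-order diff depends on exactly two input elements, so one output index can be handled independently of all others. The sketch below is a plain-Python stand-in, not a Triton kernel. It assumes a contiguous row-major tensor passed as a flat list plus a shape tuple, and `diff_pointwise` is a hypothetical name.

```python
from math import prod


def diff_pointwise(flat, shape, dim):
    """First-order diff along `dim` of a contiguous row-major tensor (flat list)."""
    dim %= len(shape)                   # allow negative dims
    length = shape[dim]
    inner = prod(shape[dim + 1:])       # stride of `dim` in a contiguous layout
    outer = prod(shape[:dim])
    out_len = max(length - 1, 0)
    out = [0] * (outer * out_len * inner)
    for o in range(len(out)):           # one independent "program" per output element
        a, rem = divmod(o, out_len * inner)
        j, b = divmod(rem, inner)
        src = (a * length + j) * inner + b
        # Neighbours along `dim` are exactly `inner` elements apart.
        out[o] = flat[src + inner] - flat[src]
    out_shape = shape[:dim] + (out_len,) + shape[dim + 1:]
    return out, out_shape
```

For example, `diff_pointwise([1, 2, 4, 7, 11, 16], (2, 3), 1)` treats the input as a 2x3 tensor and differences along the last dimension. Higher orders could be obtained by applying the function n times.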

@0x45f 0x45f self-assigned this Feb 8, 2025
@0x45f
Collaborator

0x45f commented Feb 8, 2025

For the 1d case, I profiled it, and it seems that too much time is spent outside diff_kernel_1d, which slows the operator down. You can look at the two profile files in the zip:
diff-1d-profile.zip
Also, in torch the diff op is implemented using aten::narrow and aten::sub (see: https://github.com/pytorch/pytorch/blob/main/aten/src/ATen/native/ReduceOps.cpp#L930).

@CelysPr
Contributor Author

CelysPr commented Feb 10, 2025

> For the 1d case, I profiled it and it seems that too much time outside the diff_kernel_1d caused the performance to slow down. You can look at these two profile files in zip.
> diff-1d-profile.zip
> And in torch, the diff op is implemented using aten::narrow and aten::sub. (see: https://github.com/pytorch/pytorch/blob/main/aten/src/ATen/native/ReduceOps.cpp#L930)

Thank you for pointing this out. I will refine my code this week.

@0x45f
Collaborator

0x45f commented Feb 27, 2025

@CelysPr Have you tried to fix the performance issues recently?

3 participants